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Abstract 

Chemical reaction networks (CRNs) formally model chemistry in a well-mixed solution. 

I CRNs are widely used to describe information processing occurring in natural cellular regulatory 

networks, and with upcoming advances in synthetic biology, CRNs are a promising language for 
the design of artificial molecular control circuitry. Nonetheless, despite the widespread use of 
CRNs in the natural sciences, the range of computational behaviors exhibited by CRNs is not 

j^" 1 well understood. 

v*' CRNs have been shown to be efficiently Turing-universal when allowing for a small proba- 

u bility of error. CRNs that are guaranteed to converge on a correct answer, on the other hand, 

have been shown to decide only the semilinear predicates. We introduce the notion of function, 
rather than predicate, computation by representing the output of a function / : N fe — > N' by a 
count of some molecular species, i.e., if the CRN starts with x\, . . . ,Xk molecules of some "in- 
put" species Xi,..., Xk, the CRN is guaranteed to converge to having f(x\, . . . , Xk) molecules 
of the "output" species Y± , . . . , Yi . We show that a function / : N fc — > N l is deterministically 
computed by a CRN if and only if its graph {(x, y) e N fe x N' | /(x) = y} is a semilinear set. 
I Finally, we show that each semilinear function / can be computed by a CRN on input x in 

expected time 0(polylog |jx||i). 

^ 1 Introduction 



> 

X 



The engineering of complex artificial molecular systems will require a sophisticated understanding 
of how to program chemistry. A natural language for describing the interactions of molecular species 
in a well- mixed solution is that of (finite) chemical reaction networks (CRNs), i.e., finite sets of 
chemical reactions such as A + B — > A + C. When the behavior of individual molecules is modeled, 
CRNs are assigned semantics through stochastic chemical kinetics [11], in which reactions occur 
probabilistically with rate proportional to the product of the molecular count of their reactants 
and inversely proportional to the volume of the reaction vessel. 

Traditionally CRNs have been used as a descriptive language to analyze naturally occurring 
chemical reactions (as well as numerous other systems with a large number of interacting compo- 
nents such as gene regulatory networks and animal populations). However, recent investigations 
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have viewed CRNs as a programming language for engineering artificial systems. These works have 
shown CRNs to have eclectic computational abilities. Researchers have investigated the power of 
CRNs to simulate Boolean circuits [15], neural networks [12], and digital signal processing [13]. 
Other work has shown that bounded-space Turing machines can be simulated with an arbitrarily 
small, non-zero probability of error by a CRN with only a polynomial slowdown [2]. 1 Even Turing 
universal computation is possible with an arbitrarily small, non-zero probability of error over all 
time [19]. The computational power of CRNs also provides insight on why it can be computation- 
ally difficult to simulate them [18], and why certain questions are frustatingly difficult to answer 
(e.g. undecidable) [10,21]. The programming approach to CRNs has also, in turn, resulted in novel 
insights regarding natural cellular regulatory networks [6]. 

Recent work proposes concrete chemical implementations of arbitrary CRNs, particularly using 
nucleic-acid strand-displacement cascades as the physical reaction primitive [7,20]. Thus, since 
in principle any CRN can be built, hypothetical CRNs with interesting behaviors are becoming 
of more than theoretical interest. One day artificial CRNs may underlie embedded controllers 
for biochemical, nanotechnological, or medical applications, where environments are inherently 
incompatible with traditional electronic controllers. 

One of the best-characterized computational abilities of CRNs is the deterministic computation 
of predicates as investigated by Angluin, Aspnes and Eisenstat [3]. (They considered an equivalent 
distributed computing model known as population protocols.) Some CRNs, when started in an 
initial configuration assigning nonnegative integer counts to each of k different input species, are 
guaranteed to converge on a single "yes" or "no" answer, in the sense that there are two special 
"voting" species L 1 and L° so that eventually either L 1 is present and L° absent to indicate "yes" , 
or vice versa to indicate "no." The set of inputs S C N fc that cause the system to answer "yes" is 
then a representation of the decision problem solved by the CRN. Angluin, Aspnes and Eisenstat 
showed that the input sets S decidable by some CRN are precisely the semilinear subsets of N fc 
(see below). 

We extend these prior investigations of decision problems or predicate computation to study 
deterministic function computation. Consider the three examples in Fig. l(top). These CRNs have 
the property that they converge to the right answer no matter the order in which the reactions 
happen to occur and are thus insensitive to stochastic effects as well as reaction rate constants. 
Formally, we say a function / : N k — > N l is computed by a CRN C if the following is true. There 
are "input" species Xi, . . . ,Xk and "output" species Y%, . . . ,Yi such that, if C is initialized with 
xi, . . . ,Xk copies of X\, . . . , X/,, then it is guaranteed to reach a configuration in which the counts 
of Yi, . . . , Yi are described by the vector f(x\, . . . , Xk), and these counts never again change. For 
example, the CRN C with the single reaction X —> 2Y computes the function f(x) = 2x in the 
sense that, if C starts in an initial configuration with x copies of X and copies of Y, then C is 
guaranteed to stabilize to a configuration with 2x copies of Y (and no copies of X). Similarly, the 
function f(x) = [x/2\ is computed by the single reaction 2X — > Y (Fig. 1(a)), in that the final 
configuration is guaranteed to have exactly [x/2\ copies of Y (and or 1 copies of X , depending 
on whether x is even or odd). 

It is illuminating to compare the computation of division by 2 shown in Fig. 1(a) with another 
reasonable alternative: reactions X — > Y and Y — > X (i.e. the reversible reaction X^Y). If the 
rate constants of the two reactions are equal, the system equilibrium is at half of the initial amount 

1 This is surprising since finite CRNs necessarily must represent binary data strings in a unary encoding, since 
they lack positional information to tell the difference between two molecules of the same species. 
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Figure 1: Examples of deterministically computable functions. (Top) Three functions and examples of CRNs 
deterministically computing them. The input is represented in the molecular count of X (for (a)), and moleculer 
counts of X\, X2 (for (b) and (c)). The output is represented by the molecular count of Y. Example (a) computes 
via the relative stoichiometry of reactants and products of a single reaction. In example (b), the second and third 
reactions convert B to Y and vice versa, catalyzed by Xi and B, respectively. Thus, if there are any X\ remaining 
after the first reaction finishes (and thus X\ > X2), all of B can get converted to Y permanently (since some B is 
required to convert Y back to B). Since in this case the first reaction produces X2 molecules of B, X2 molecules of the 
output Y are eventually produced. If the first reaction consumes all of Xi (and thus x\ < X2), eventually any Y that 
was produced in the second reaction gets converted to B by the third reaction. To see that the CRN in (c) correctly 
computes the maximum, note that the first two reactions eventually produce x\ + X2 molecules of Y, while the third 
reaction eventually produces min(a;i,a;2) molecules of K. Thus the last reaction eventually consumes min(:z;i, £2) 
molecules of Y leaving xi + X2 — min(xi,3;2) = maxfii,^) Y's. (Bottom) Graphs of the three functions. The set 
of points belonging to the graph of each of these functions is a semilinear set. Under each plot this semilinear set is 
written in the form of a union of linear sets corresponding to Equation 1.1. The defining vectors are shown as colored 
arrows in the graph. 

of X transformed to Y. There are two stark differences between this implementation and that of 
Fig. 1(a). First, this CRN would not have an exact output count of Y, but rather a distribution 
around the equilibrium. (However, in the limit of large numbers, the error as a fraction of the 
total would converge to zero.) The second difference is that the equilibrium amount of Y for 
any initial amount of X depends on the relative rate constants of the two reactions. In contrast, 
the deterministic computation discussed in this paper relies on the identity and stoichiometry 
of the reactants and products rather than the rate constants. While the rates of reactions are 
analog quantities, the identity and stoichiometry of the reactants and products are naturally digital. 
Methods for physically implementing CRNs naturally yield systems with digital stoichiometry that 
can be set exactly [7,20]. While rate constants can be tuned, being analog quantities, it cannot be 
expected that they can be controlled precisely. 

A few general properties of this type of deterministic computation can be inferred. The first 
property is that a deterministic CRN is able to handle input molecules added at any time, and not 
just initially. Otherwise, if the CRN could reach a state after which it no longer "accepts input" , 
then there would be a sequence of reactions that would lead to an incorrect output even if all input 
is present initially. This reaction sequence is one in which some input molecules remain unreacted 
while the CRNs goes to a state in which input is no longer accepted - which is always possible. 

The second general property of deterministic computation relates to composition. As any bona 
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fide computation must be composable, it is important to ask: can the output of one deterministic 
CRN be the input to another? This is more difficult than in standard computing since there is in 
general no way of knowing when a CRN is done computing, or whether it will change its answer 
in the future. This is essentially because a CRN cannot deterministically detect the absence of 
a species, and thus, for example, cannot discern when all input has been read. Moreover, simply 
concatenating two deterministic CRNs (renaming species to avoid conflict) does not always yield a 
deterministic CRN. For example, consider computing the function f{x\,X2j = [max(a?i, a?2)/2j by 
composing the CRNs in Fig. 1(c) and (a). The new CRN is: 

x 2 -> z 2 + w 

Z Y + Z 2 -> K 
K + W -»■ 
W + W -> Y 

where W is the output species of the max computation, that acts as the input to the division by 
2 computation. Note that if W happens to be converted to Y by the last reaction before it reacts 
with K, then the system can converge to a final output value of Y that is larger than expected. In 
other words, because the first CRN needs to consume its output W, the second CRN can interfere 
by consuming W itself (in the process of reading it out). 

Unlike in the above example, two deterministic CRNs can be simply concatenated to make 
a new deterministic CRN if the first CRN never consumes its output species (i.e. it produces its 
output "monotonically" ) . Since it doesn't matter when the input to the second CRN is produced 
(the first property, above), the overall computation will be correct. Yet deterministically computing 
a non-monotonic function without consuming output species is impossible because the CRN must 
be able to handle some input molecules reacting only after the output has already been produced 
(i.e. the first property, above). In a couple of places in this paper, we convert a non-monotonic 
function into a monotonic one over more outputs, to allow the result to be used by a downstream 
CRN (see below). 

What do the functions in Fig. l(top) have in common such that the CRNs computing them 
can inevitably progress to the right answer no matter what order the reactions occur in? What 
other functions can be computed similarly? Answering these questions may seem difficult because 
it appears like the three examples, although all deterministic, operate on different principles and 
seem to use different ideas. 

We show that the functions deterministically computable by CRNs are precisely the semilinear 
functions, where we define a function to be semilinear if its graph {(x, y) £ x N' | /(x) = y} is 
a semilinear subset of N fc xN f . (See Fig. 1 (bottom) for the graphs of the three example functions.) 
This means that the graph of the function is a union of a finite number of linear sets - i.e. sets 
that can be written in the form 

{ b + mm + . . . + n p VLp | m, . . . ,n p £ N } (1.1) 

for some fixed vectors b, ui, . . . , u p 6 p*f fc +'. Fig. 1 (bottom) shows the graphs of the three example 
functions expressed as a union of sets of this form. Informally, semilinear functions can be thought 
of as "piecewise linear functions" with a finite number of pieces, and linear domains of each piece. 2 

2 Semilinear sets have a number of characterizations. They are often thought of as generalizations of arithmetic 
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This characterization implies, for example, that such functions as f(x\,X2) = X\X2, f(x) = 
x 2 , or f(x) = 2 X are not deterministically computable. For instance, the graph of the function 
f(xx,X2) = x\X2 consists of infinitely many lines of different slopes, and thus, while each line 
is a linear set, the graph is not a finite union of linear sets. Our result employs the predicate 
computation characterization of Angluin, Aspnes and Eisenstat [3], together with some nontrivial 
additional technical machinery. 

While the example CRNs in Fig. 1 all seem to use different "tricks", in Section 4 we de- 
velop a systematic construction for any semilinear function. To get the gist of this construc- 
tion see the example in Fig. 2. To obtain a CRN computing the example semilinear function 
f(x\,X2) = max(2xi — ^2, ^2), we decompose the function into "linear" pieces: f\(x\,X2) = 2x\ — X2 
and f2{x\i%2) = %2 (formally partial affine functions, see Section 2). Then semilinear predicate 
computation (per Angluin, Aspnes and Eisenstat) is used to decide which linear function should 
be applied to a given input. A decomposition compatible with this approach is always possible by 
Lemma 4.4. Linear functions such as /1 and /jj are easy for CRNs to deterministically compute by 
the relative stoichiometry of the reactants and products (analogously to the example in Fig. 1(a)). 
However, note that to correctly compose the computation of fi with the downstream computa- 
tion (Fig. 1(b), right column) we convert fx from a non-monotonic function with one output, to a 
monotonic function with two outputs such that the original output is encoded by their difference. 

In the last part of this paper, we turn our attention to the time required for CRNs to converge 
to the answer. We show that every semilinear function can be deterministically computed on 
input x in expected time polylog(||x||). This is done by a similar technique used by Angluin, 
Aspnes, and Eisenstat [3] to show the equivalent result for predicate computation. They run 
a slow deterministic computation in parallel with a fast randomized computation, allowing the 
deterministic computation to compare the two answers and update the randomized answer only if 
it is incorrect, which happens with low probability. However, novel techniques are required since it 
is not as simple to "nondestructively compare" two integers (so that the counts are only changed 
if they are unequal) as to compare two Boolean values. 

2 Preliminaries 

Given a vector x G N fc , let ||x|| = ||x||i = ^i=i l x WI) where x(z) denotes the ith coordinate of x. 
A set A C N fc is linear if there exist vectors b, ui, . . . , u p G N fc such that 

A = { b + niui + . . . + n p u p \ n\, . . . , n p G N } . 

A is semilinear if it is a finite union of linear sets. If / : N fe — > N' is a function, define the graph of / 
to be the set { (x, y) £ M fc x N' | /(x) = y } . A function is semilinear if its graph is a semilinear 
set. 

We say a partial function / : N fc — > N is affine if there exist kl rational numbers an, . . . , a^ \ G 
Q and I + k nonnegative integers b\, . . . , hi, c\, . . . , c\. G N such that, if y = /(x), then for each 
j G {1, . • • , /}, y(j) = bj + Ya=i «i,j( x W - c »)i and for each i G {!>•••! k}, x(i) - Ci > 0. (In 

progressions. They are also exactly the sets that are definable in Presburger arithmetic [16]: the first-order theory of 
the natural numbers with addition. Equivalently, they are the sets accepted by boolean combinations of "modulo" and 
"threshold" predicates [3]. Semilinear functions are less well-studied. The "piecewise linear" intuitive characterization 
is formalized in Lemma 4.4. 
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/(x, , x 2 )= max(2x r x 2 , x 2 ) 




Any semilinear function can be piecewise defined in terms of linear functions, 
with the decision dictated by semilinear predicates (Lemma 4.4): 



/(xi,x 2 ) = 



/ 1 (xi,x 2 )=2x r x 2 ifx!<x 2 



/ 2 (x b x 2 )= x 2 



otherwise 



start with: (input) X\ , Xl , (initial context) 1 molecule of L° 
output: Y 

X\ ~^X\ + Xf + Xl make 3 copies of inputs for the 3 

x 2 ^xl+x 2 + xi 



x\ -> p 1 + p 1 
xl^c 1 

xl -> P 2 



Xl + L°- 



• 1} 



parallel computations 

- compute linear function /j(Xj, x 2 )= 2xpX 2 
compute linear function / 2 (X[, x 2 )= x 2 

- compute semilinear predicate "xj < x 2 ?" 



L 1 + P 1 -> L 1 + F + P 1 
L 1 + Y + P 2 ^L 1 + P 2 



L° + C* 1 -> L° + y + C* 1 
L° + F + P 2 ^ L° + P 2 

L° + P 2 -> L° + y + P 2 



if predicate true (ie L}): 

every P 1 increases Y and 
every C 1 decreases Y 

undo the effect ofL° 

if predicate false (ie L°): 
undo the effect ofL 1 

every P 2 increases Y 



Figure 2: An example capturing the essential elements of our systematic construction for computing semilinear 
functions (Lemma 4.3). To compute the target semilinear function, we recast it as a piecewise function defined in terms 
of linear functions, such that semilinear predicates can decide which of the linear functions is applicable for a given 
input (this recasting is possible by Lemma 4.4). (a) The graph of the target function visualizing the decomposition 
into linear functions, (b) A CRN deterministically computing the target function with intuitive explanations of 
the reactions. We use tri-molecular reactions for simplicity of exposition; however, these can be converted into a 
sequence of bimolecular reactions. Note that we allow an "initial context" : a fixed set of molecules that are always 
present in the initial state in addition to the input. The linear functions /i and fi are computed monotonically by 
representing the output as the difference of P ("produce") minus C ("consume") species. Thus although P 1 — C 1 
could be changing non-monotonically, P 1 and C 1 do not decrease over time, allowing them to be used as inputs for 
downstream computation. To compute the semilinear predicate u Xi < £2?", a single molecule, converted between L° 
and L 1 forms, goes back and forth consuming Xf and Xf. Whether it gets stuck in the L° or L 1 forms indicates 
the excess of X% or Jff . The reactions in the right column use the output of this predicate computation to set the 
count of Y (the global output) to either the value computed by /1 or fa. Note that the CRN cannot "know" when 
the predicate computation has finished since the absence Xi or X| cannot be detected. Thus the reactions in the 
right column must be capable of responding to a change in L°/L 1 . Species P 1 , P 2 , and C 1 are used to backup the 
values of P 1 , P 2 , and C 1 , enabling the switch in output. 



matrix notation, there exist a k x I rational matrix A and vectors b £N' and c € N fc such that 
/(x) = A(x — c) + b.) In other words, the graph of /, when projected onto the (k + l)-dimensional 
space defined by the k coordinates corresponding to x and the single coordinate corresponding to 
y(j), is a subset of a A;-dimensional hyperplane. 
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Four aspects of the definition of affine function invite explanation. 

First, we allow partial functions because Lemma 4.4 characterizes the semilinear functions as 
finite combinations of affine functions, where the union of the domains of the functions is the entire 
input space N fc . The value of an affine function on an input outside of its domain is irrelevant (and 
in fact may be non- integer). 

Second, we have two separate "constant offsets" bj and Cj. Affine functions over the reals are 
typically defined with only one of these, bj, where a function / : M. k — > M! is affine if there is a 
k x I real matrix A and vector b G 1R' such that /(x) = Ax + b. If instead real affine functions 
were defined as /(x) = A(x — c) + b, one could re-write this as /(x) = Ax — Ac + b and, letting 
b' = — Ac + b, obtain an affine function according to the former definition. However, if we take this 
approach in dealing with integers, it may be that while Ax — Ac + b is integer-valued, the terms 
Ax and —Ac + b are non-integer vectors, and when we compute affine functions with chemical 
reaction networks, these terms are handled separately by integer-valued counts of molecules. 

Third, it may seem overly restrictive to require bj and c% to be nonnegative. In fact, our proof 
of Lemma 4.2 is easily modified to show how to construct a CRN to compute an affine function 
that allows negative values for bj and Cj. However, Lemma 4.4 shows that, when the function is 
such that its graph is a nonnegative linear set, then we may freely assume that bj and q to be 
nonnegative. Since this simplifies some of our definitions, we use this convention, although it is 
not a strictly necessary assumption to prove computability of affine functions by chemical reaction 
networks. 

Fourth, the requirement that x(i) — q > seems artificial. When we prove that every semilinear 
function can be written as a finite union of partial affine functions with linear graphs (Lemma 4.4), 
however, this will follow from the fact that the "offset vector" in the definition of a linear set is 
required to be nonnegative. 

Note that by appropriate integer arithmetic, a partial function / : N fc — » N' is affine if and only 
if there exist kl integers ni t i, . . . , n^i G Z and 2l+k nonnegative integers b\, . . . , bi, c\, . . . , c&, d\, . . . , di G 
N such that, if y = /(x), then for each j G {1, . . . , I}, y(j) = bj + 4; Yli=i n M'( x W ~~ c i)i an d for 
each i G {1, . . . , k}, x(i) — a > 0. Each dj may be taken to be the least common multiple of the 
denominators of the rational coefficients in the original definition. We employ this latter definition, 
since it is more convenient for working with integer- valued molecular counts. 

2.1 Chemical reaction networks 

If A is a finite set (in this paper, of chemical species), we write N A to denote the set of functions 
/ : A — > N. Equivalently, we view an element c G N A as a vector of |A| nonnegative integers, with 
each coordinate "labeled" by an element of A. Given X G A and c G N A , we refer to c(X) as the 
count of X in c. We write c < c' to denote that c(X) < c'(X) for all X G A. Given c,c' G N A , 
we define the vector component-wise operations of addition c + c', subtraction c — c' , and scalar 
multiplication nc for n G N. If A C A, we view a vector c G N A equivalently as a vector c G N A by 
assuming c(X) = for all X G A \ A. 

Given a finite set of chemical species A, a reaction over A is a triple a = (r, p, k) G N A xN A xR + , 
specifying the stoichiometry of the reactants and products, respectively, and the rate constant k. If 
not specified, assume that k = 1 (this is the case for all reactions in this paper), so that the reaction 
a = (r, p, 1) is also represented by the pair (r, p) . For instance, given A = {A, B, C}, the reaction 
A + 2B — > A + 3C is the pair ((1, 2, 0), (1, 0, 3)) . A (finite) chemical reaction network ( CRN) is a 
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pair N = (A, R), where A is a finite set of chemical species, and R is a finite set of reactions over 
A. A configuration of a CRN N = (A, R) is a vector c G N A . We also write # C X to denote c(X), 
the count of species X in configuration c, or simply j^X when c is clear from context. 

Given a configuration c and reaction a = (r, p), we say that a is applicable to c if r < c (i.e., 
c contains enough of each of the reactants for the reaction to occur). If a is applicable to c, then 
write a(c) to denote the configuration c + p — r (i.e., the configuration that results from applying 
reaction a to c). If c' = a(c) for some reaction a G R, we write c — >n c', or merely c — > c' when 
N is clear from context. An execution (a.k.a., execution sequence) £ is a finite or infinite sequence 
of one or more configurations £ = (co, ci, C2, . . .) such that, for all i G {1, . . . , \£ \ — 1}, Cj_i — > c,. 
If a finite execution sequence starts with c and ends with c', we write c — >* N c' , or merely c — >•* c' 
when the CRN N is clear from context. In this case, we say that c' is reachable from c. 

Let A C A. We say that p G N A is a partial configuration (with respect to A). We write 
p = c [ A for any configuration c such that c(X) = p(X) for all X G A, and we say that p is 
the restriction of c to A. Say that a partial configuration p with respect to A is reachable from 
configuration c' if there is a configuration c reachable from c' and p = c \ A. In this case, we write 
c' ^* p. 

Turing machines, for example, have different semantic interpretations depending on the com- 
putational task under study (deciding a language, computing a function, etc.). Similarly, in this 
paper we use CRNs to decide subsets of N k and to compute functions / : N fc — > N'. In the next two 
subsections we define two semantic interpretations of CRNs that correspond to these two tasks. 

2.2 Stable decidability of predicates 

We now review the definition of stable decidability of predicates introduced by Angluin, Aspnes, 
and Eisenstat [3]. 3 Intuitively, some species "vote" for a yes/no answer and the system stabilizes 
to an output when a consensus is reached and it can no longer change its mind. The determinism 
of the system is captured in that it is impossible to stabilize to an incorrect answer, and the correct 
stable output is always reachable. 

A chemical reaction decider (CRD) is a tuple T> = (A,R,T,,T,4>,o~), where (A, R) is a CRN, 
X C A is the set of input species, T C A is the set of voters 4 , cj) : T — > {0, 1} is the (Boolean) output 
function, and a G N A \ S is the initial context. An input to T> will be a vector io G N s (equivalently, 
io G N fc if we write X = {Xi, . . . , X^} and assign Xi to represent the i'th coordinate). Thus a CRD 
together with an input vector defines an initial configuration i defined by i(X) = io(AT) if X G 2, 
and i(X) = o~(X) otherwise. We say that such a configuration is a valid initial configuration, i.e., 
i \ (A \ X) = a. If we are discussing a CRN understood from context to have a certain initial 
configuration i, we write #oX to denote i(X). 

We extend to a partial function $ : N A — » {0, 1} as follows, ^(c) is undefined if either 
c{X) = for all X G T, or if there exist X ,X X G T such that c(X ) > 0, c(X x ) > 0, (f>(X ) = 
and 0(Xi) = 1. Otherwise, there exists b G {0, 1} such that (VX G T)(c(X) > =>• cp(X) = b); 
in this case, the output <&(c) of configuration c is b. 

3 Those authors use the term "stably compute" , but we reserve the term "compute" to apply to the computation 
of non-Boolean functions. 

4 The definitions of [3] assume that T = A (i.e., every species votes). However, it is not hard to show that we 
may assume there are only two voting species, L° and L 1 , so that #L° > and jfL 1 = means that the CRD is 
answering "no", and #L° = and #1/ > means that the CRD is answering "yes." This convention will be more 
convenient in this paper. 
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A configuration c is output stable if $(c) is defined and, for all c' such that c — >* c' , ^(c') = ^(c). 
We say a CRD V stably decides the predicate tp : N s — > {0, 1} if, for any valid initial configuration 
i G N A with i f £ = io, for all configurations c G N A , i — >•* c implies c — >•* c' such that c' is 
output stable and ( l ) (c / ) = V'(io)- Note that this condition implies that no incorrect output stable 
configuration is reachable from i. We say that T> stably decides a set A G N fc if it stably decides its 
indicator function. 

The following theorem is due to Angluin, Aspenes, and Eisenstat [3]: 

Theorem 2.1 ( [3]). A set A C N fc is stably decidable by a CRD if and only if it is semilinear. 

The model they use is defined in a slightly different way. They study population protocols, a 
distributed computing model in which a fixed-size set of agents, each having a state from a finite 
set, undergo successive pairwise interactions, the two agents updating their states upon interacting. 
This is equivalent to chemical reaction networks in which all reactions have exactly two reactants 
and two products. 

In fact, the forward direction of Theorem 2.1 (every stably decidable set is semilinear) holds even 
if stable computation is defined with respect to any relation — >* on N fc that is reflexive, transitive, 
and "respects addition", i.e., [(Vci,c 2 ,x G N fc ) (ci ->■* c 2 ) ==>- (ci + x ->•* c 2 + x)]. These 
properties can easily be shown to hold for the CRN reachability relation. The third property, in 
particular, means that if some molecules ci can react to form molecules c 2 , then it is possible for 
them to react in the presence of some extra molecules x, such that no molecules from x react at 
all. 

2.3 Stable computation of functions 

We now define a notion of stable computation of functions similar to those above for predicates. 5 
Intuitively, the inputs to the function are the initial counts of input species X\, . . . ,X k , and the 
outputs are the counts of "output" species Y\,...,Yi. The system stabilizes to an output when 
the counts of the output species can no longer change. Again determinism is captured in that it is 
impossible to stabilize to an incorrect answer and the correct stable output is always reachable. 

Let k,l G Z + . A chemical reaction computer (CRC) is a tuple C = (A, R, E, T, a), where 
(A, R) is a CRN, S C A is the set of input species, T C A is the set of output species, such that 
S n r = 0, |E| = k, |T| = I, and a G N A \ S is the initial context. Write S = {X 1} X 2 , . . . ,X k } 
and r = {Y\, Y 2 , ■ ■ ■ ,Y{\ . We say that a configuration c is output count stable if, for every c' such 
that c —7-* c' and every Y{ G T, c(l^) = c'(Y^) (i.e., the counts of species in T will never change 
if c is reached). As with CRD's, we require initial configurations i of C with input io G N s to 
obey i(A) = io(A) if X G S and i(X) = o~(X) otherwise, calling them valid initial configura- 
tions. We say that C stably computes a function / : N k — > N l if for any valid initial configura- 
tion i G N A , i —?■* c implies c — >•* c' such that c' is an output count stable configuration with 
/(i(Xi),i(X 2 ), . . . ,\{X k )) = (c'(Y 1 ),c'(Y 2 ),...,c'(Yi)). Note that this condition implies that no 
incorrect output stable configuration is reachable from i. 

As an example of a formally defined CRC consider the function f(x) = [x/2\ shown in Fig. 1(a). 
This function is stably computed by the CRC (A, R, S, T, a) where (A, R) is the CRN consisting of 
a single reaction 2X — > Y, S = {X} is the set of input species, V = {Y} is the set of output species, 

The extension from Boolean predicates to functions described by Aspnes and Ruppert [4] applies only to finite- 
range functions, where one can choose |A| > \Y\ for output range Y. 
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and the initial context a is zero for all species in A \ S. In Fig. 1(b) the initial context o~(N) = 1, 
and is zero for all other species in in A \ S. In examples (a) and (b), there is at most one reaction 
that can happen in any reachable configuration. In contrast, different reactions may occur next in 
(c). However, from any reachable state, we can reach the output count stable configuration with 
the correct amount of Y, satisfying our definition of stable computation. 

In Sections 3 and 4 we will describe systematic (but much more complex) constructions for 
these and all functions with semilinear graphs. 

2.4 Fair execution sequences 

Note that by defining "deterministic" computation in terms of certain states being reachable and 
others not, we cannot guarantee the system will get to the correct output for any possible execution 
sequence. For example suppose an adversary controls the execution sequence. Then {X — > 2Y, A — > 
B,B —7- A} will not reach the intended output state y = 2x if the adversary simply does not let 
the first reaction occur, always preferring the second or third. 

Intuitively, in a real chemical mixture, the reactions are chosen randomly and not adversar- 
ially, and the CRN will get to the correct output. In this section we follow Angluin, Aspnes, 
and Eisenstat [3] and define a combinatorial condition called fairness on execution sequences that 
captures what is minimally required of the execution sequence to be guaranteed that a stably decid- 
ing/computing CRD/CRC will reach the output stable state. In the next section we consider the 
kinetic model, which ascribes probabilities to execution sequences. The kinetic model also defines 
the time of reactions, allowing us to study the computational complexity of the CRN computation. 
Note that in the kinetic model, if the reachable configuration space is bounded for any start con- 
figuration (i.e. if from any starting configuration there are finitely many configurations reachable) 
then any observed execution sequence will be fair with probability 1. (This will be the case for our 
construction in Section 4.) 

An infinite execution E = (cq, ci, C2, . . .) is fair if, for all partial configurations p, if p is infinitely 
often reachable then it is infinitely often reached. 6 In other words, no reachable partial configuration 
is "starved". 7 This definition, applied to finite executions, deems all of them fair vacuously. We 
wish to distinguish between finite executions that can be extended by applying another reaction 
and those that cannot. Say that a configuration is terminal if no reaction is applicable to it. We 
say that a finite execution is fair if and only if it ends in a terminal configuration. For any species 
A £ A, we write #oo^4 to denote the eventual convergent count of A if j^A is guaranteed to stabilize 
on any fair execution sequence; otherwise, #oo^4 is undefined. 

The next lemma characterizes stable computation of functions by CRCs in terms of fair execu- 
tion sequences, showing that the counts of output species will converge to the correct output values 
on any fair execution sequence. An analogous lemma holds for CRDs. 

Lemma 2.2. A CRC stably computes a function f : N fc — > N l if and only if for every valid 
6 i.e. (VA C A)(Vp G N A )[((3°°i G N) c* ->* p) =► ({3°°j G N) p = Cj \ A)]. 

7 This definition of fairness is stricter than that used in [3], which used only full configurations rather than par- 
tial configurations. We choose this definition to prevent intuitively unfair executions from vacuously satisfying the 
definition of "fair" simply because of some species whose count is monotonically increasing with time (preventing 
any configuration from being infinitely often reachable). Such a definition is unnecessary in [3] because population 
protocols by definition have a finite state space, since they enforce that every reaction has precisely two reactants 
and two products. 
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initial configuration i £ N , every fair execution £ = (i, Ci,C2, . . .) contains an output count stable 
configuration c such that /(i(Xi), i(A^), • • • , i(Xk)) = (c(li), c (^2) 5 • • • , c (^))- 

Proof. The "if" direction follows because every finite execution sequence can be extended to be 
fair, and thus an output count stable configuration with the correct output is always reachable. 
The "only if" direction is shown as follows. We know that from any reachable configuration c, 
some correct output stable configuration c' is reachable (but possibly different c' for different 
c). We'll argue that in any infinite fair execution sequence there is some partial configuration 
that is reachable infinitely often, and that any state with this partial configuration is the correct 
stable output state. Consider an infinite fair execution sequence cj.,C2, ... , and the corresponding 
reachable correct output stable configurations c^,c 2 , As in Lemma 11 of [3], there is some 
integer k > 1 such that a configuration is output count stable if and only if it is output count 
stable when each coordinate that is larger than k is set to exactly k (/c-truncation). The infinite 
sequence Ci,c' 2 , ... must have an infinite subsequence sharing the same k-truncation. Let p be 
the partial configuration consisting of the correct output and all the coordinates less than k in the 
shared truncation. This partial configuration is reachable infinitely often, and no matter what the 
counts of the other species outside of p are, the resulting configuration is output count stable. □ 



2.5 Kinetic model 

The following model of stochastic chemical kinetics is widely used in quantitative biology and other 
fields dealing with chemical reactions between species present in small counts [11]. It ascribes 
probabilities to execution sequences, and also defines the time of reactions, allowing us to study 
the computational complexity of the CRN computation in Section 4. 

In this paper, the rate constants of all reactions are 1, and we define the kinetic model with 
this assumption. A reaction is unimolecular if it has one reactant and bimolecular if it has two 
reactants. We use no higher-order reactions in this paper when using the kinetic model. 

The kinetics of a CRN is described by a continuous-time Markov process as follows. Given a 
fixed volume v and current configuration c, the propensity of a unimolecular reaction a : X — > . . . 
in configuration c is p(c, a) = # C X. The propensity of a bimolecular reaction a : X + Y — > . . ., 
where X ^ Y, is p(c, a) = # cX # cY . The propensity of a bimolecular reaction a : X + X — > . . . 

is p(c, a) = 2 — ^ • The propensity function determines the evolution of the system as 

follows. The time until the next reaction occurs is an exponential random variable with rate 
p(c) = X^ae-R Z'( c ' a ) ( n °t e that p(c) = if no reactions are applicable to c). The probability that 
next reaction will be a particular a nex t is "^ry 2 ^- 

The kinetic model is based on the physical assumption of well-mixedness valid in a dilute so- 
lution. Thus, we assume the finite density constraint, which stipulates that a volume required to 
execute a CRN must be proportional to the maximum molecular count obtained during execu- 
tion [19]. In other words, the total concentration (molecular count per volume) is bounded. This 
realistically constrains the speed of the computation achievable by CRNs. Note, however, that it 
is problematic to define the kinetic model for CRNs in which the reachable configuration space 
is unbounded for some start configurations, because this means that arbitrarily large molecular 
counts are reachable. 8 We apply the kinetic model only to CRNs with configuration spaces that 
are bounded for each start configuration. 

8 One possibility is to have a "dynamically" growing volume as in [19]. 
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3 Exactly the semilinear functions can be deterministically com- 
puted 

In this section we use Theorem 2.1 to show that only "simple" functions can be stably computed 
by CRCs. This is done by showing how to reduce the computation of a function by a CRC to the 
decidability of its graph by a CRD, and vice versa. In this section we do not concern ourselves with 
kinetics. Thus the volume is left unspecified, and we consider the combinatorial-only condition 
of fairness on execution sequences for our positive result (Lemma 3.2) and direct reachability 
arguments for the negative result (Lemma 3.1). 

The next lemma shows that every function computable by a chemical reaction network is semi- 
linear by reducing stably deciding a set that is the graph of a function to stably computing that 
function. It turns out that the reduction technique of introducing "production" and "consumption" 
indicator species will be a general technique, used repeatedly in this paper. 

Lemma 3.1. Every function stably computable by a CRC is semilinear. 

Proof. Suppose there is a CRC C stably computing /. We will construct a CRD V that stably 
decides the graph of /. By Theorem 2.1, this implies that the graph of / is semilinear. Intuitively, 
the difficulty lies in checking whether the amount of the outputs Yj produced by C matches the 
value given to the decider T> as input. What makes this non-trivial is that T> does not know whether 
C has finished computing, and thus must compare Yi while Yi is potentially being changed by C. 
In particular, T> cannot consume Yi or that could interfere with the operation of C. 

Let C = (A, R, S, T, a) be the CRC that stably computes / : N k — > W, with input species 
S = {X\, . . . , Xk} and output species V = {Yi, . . . , Y{\. Modify C to obtain the following CRD V = 
(A', R', T', cj)', a'). Let y c = {Yf, . . . , Yf} and y p = {Y p , . . . , Y p }, where each Yf, Y p A 
are new species. Intuitively, 4fY p represents the number of Yj's produced by C and #Yf the number 
of Yj's consumed by C. The goal is for V to stably decide the predicate f(#oXi, . . . , #o^fc) = 
(#oY 1 C > • • • ) ftoYi )■ I n other words, the initial configuration of T> will be the same as that of C 
except for some copies of Yf , equal to the purported output of / to be tested by V. 

Let A' = A U y c U y p U {L°, L 1 }. Let S' = SU y c . Let T' = {L°, L 1 }, with ^(L ) = 
and ^(L 1 ) = 1. Let a'(L l ) = 1 and a'(S) = for all S € A'\ (£' U {L 1 }). Modify R to 
obtain R' as follows. For each reaction a that consumes a net number n of Yi molecules, append 
n products Yf to a. For each reaction a that produces a net number n of Yi molecules, append 
n products Y p to a. For example, the reaction A + 2B + Y\ + 3Y3 — > Z + 3Yi + 2Y3 becomes 
A + 2B + Yi + 3Y 3 Z + 3Yi + 2Y 3 + 2Y/ 5 + Yf. 

Then add the following additional reactions to R' , for each i G {1, . . . , I}, 

Yf + Yf -> L 1 (3.1) 

Y p + L l -> Y p + L° (3.2) 

Yf + L l -> Yf + L (3.3) 

L° + L 1 -> L 1 (3.4) 

Observe that if f(#oXi, . . . , #oATfc) = (#oYf'\ ■ ■ ■ > #0^ ) 5 then from any reachable configura- 
tion we can reach a configuration without any Y^ 3 or Y^ for all i, and such that no more of either 
kind can be produced. (The CRC stabilizes and all of Yf and Yf is consumed by reaction 3.1.) In 
this configuration we must have ftL 1 > because the last instance of reaction 3.1 produced it (or if 
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no output was ever produced, L 1 comes from the initial context a'), and L 1 can no longer be con- 
sumed in reactions 3.2-3.3. Thus, since all of L° can be consumed in reaction 3.4, a configuration 
with f^L 1 > and #L° = is always reachable, and this configuration is output stable. 

Now suppose f(#oXi, . . . , #o^fc) 7^ (#oY-f , . . . , ) for some output coordinate i* 6 {1, . . . , /}. 

This means that from any reachable configuration we can reach a configuration with either #Yj£ > 
or #1^? > but not both, and such that for all i, no more of Yf and Yf can be produced. (This 
happens when the CRC stabilizes and reaction 3.1 consumes the smaller of Y^f or Y£ .) From this 
configuration, we can reach a configuration with #L° > and ftL 1 = through reactions 3.2-3.3. 
This is an output stable configuration since reactions 3.2-3.4 require L . □ 

The next lemma shows the converse of Lemma 3.1. Intuitively, it uses a random search of the 
output space to look for the correct answer to the function and uses a predicate decider to check 
whether the correct solution has been found. 

Lemma 3.2. Every semilinear function is stably computable by a CRC. 
Proof. Let / : N k — > N l be a semilinear function, and let 

F= { (x,y) GN fc xN l | /(x) = y } 

denote the graph of /. We then consider the set 

F = { (x,y P , y c ) E N k x x | /(x) = y P - y c } . 

Intuitively, F defines the same function as F, but with each output variable expressed as the 
difference between two other variables. Note that F is not the graph of a function since for each 
y E N' there are an infinite number of pairs (yp,yc) such that yp — yc = y. However, we only 
care that F is a semilinear set so long as F is a semilinear set, by Lemma 3.3, proven below. 
Then by Theorem 2.1, F is stably decidable by a CRD D = (A, R, S, T, cj), a), where 

S = {X\, . . . , Xk, Yf, . . . , YJ P , Y 1 , . . . , Yf}, 

and we assume that T contains only species L 1 and L° such that for any output-stable configuration 
of V, exactly one of or #L° is positive to indicate a yes or no answer, respectively. 

Define the CRC C = (A', Rl , £', V, a') as follows. Let £' = {X u X k }. Let V = {Y U ..., Y t }. 
Let A' = A U r. Let a'(S) = <r(S) for all S € A \ £, and let a'{S) = for all 5 £ A'\(A\ E). 
Intuitively, we will have L° change the value of y (by producing either Y^ or Y^ molecules), since 
presence indicates that T> has not yet decided that the predicate is satisfied. It essentially 
searches for new values of y that do satisfy the predicate. This indirect way of representing the 
value y is useful because yp and yc can both be increased monotonically to change y in either 
direction. If V had Yj as a species directly, and if we wanted to test a lower value of y^, then this 
would require consuming a copy of Yj , but this may not be possible if T> has already consumed all 
of them. 

Let R! be R plus the following reactions for each j £ {1, . . . , /}: 



L° -> L° + Yf + Yj (3.5) 
Yj L° + Yf 



L° + Yj L° + Yf (3.6) 
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It is clear that reactions (3.5) and (3.6) enforce that at any time, #Yj is equal to the total 
number of Y^'s produced by reaction (3.5) minus the total number of Yj's produced by reaction 
(3.6) (although some of each of Y? or Y_- may have been produced or consumed by other reactions 
in R). 

Suppose that /(x) / (#^i, • • • , jfYi). Then if there are no L° molecules present, the counts of 
Y? and Yj are not changed by reactions (3.5) and (3.6). Therefore only reactions in R proceed, and 
by the correctness of T>, eventually an L° molecule is produced (since eventually T> must reach an 
output-stable configuration answering "no" , although L° may appear before V reaches an output- 
stable configuration, if some L 1 are still present). Once L° is present, by the fairness condition 
(choosing A = {Y\, . . . .Y/}), eventually the value of • • • , #Y[) will change by reaction (3.5) 

or (3.6). In fact, every value of (#Y\, . . . , #YJ) is possible to explore by the fairness condition. 

Suppose then that /(x) = (#Yi, . . . , #Yi). Perhaps L° is present because the reactions in R have 
not yet reached an output-stable "yes" configuration. Then perhaps the value of (#Yi, . . . , #Yi) 
will change so that /(x) ^ {JfYx, . . . , #Yi). But by the fairness condition, a correct value of 
(jfYi, . . . , $Y{) must be present infinitely many times, so again by the fairness condition, since 
from such a configuration it is possible to eliminate all L° molecules before producing Y? or 
Yj molecules, this must eventually happen. When all L° molecules are gone while /(x) = 
(#Y"i, . . . , #V/) and V is in an output-stable configuration (thus no L° can ever again be pro- 
duced), then it is no longer possible to change the value of (#Yi, • • • , #Y[), whence C has reached 
a count-stable configuration with the correct answer. Therefore C stably computes /. □ 

Lemma 3.3. Let k,l G Z + , and suppose F C N fe x is semilinear. Define 

F = { (x,y P ,y c )eN fc xN'xN z | (xj P -y c )eF }. 
Then F is semilinear. 

Proof. Let F±, . . . , F t be linear sets such that F = Uj=i Fi- For each i £ {1, . . . , t}, define 
Fi = { (x, y P , y c ) E N fc x x | (x,y P - y c ) G F { } . 

It suffices to show that each F{ is linear since F = Ui=i Fi- Let i E {1, . . . , t} and let b, ui, . . . , u r G 
N fc x N' be such that 



Ft = <j b + Y.njUj 

3=1 



rij G N 



Define the vectors vi, . . . , v r G N fe x x N as Vj = (u 3 -,0 ). Here, 0' denotes the vector in N z 
consisting of all zeros. In other words, let Vj be Uj on its first k + I coordinates and on its last / 
coordinates. Similarly define b' = (b,0'). 

Also, for each j G define v r+i = (0 fc , J ' _1 10 f_J ', j - 1 10 l ~ j ). (i.e., a single 1 in the 

position corresponding to the jth output coordinate, one for yp and one for yc). Without the 
vectors v r+ j, the set of points defined by b', vi, . . . , v r would be simply Fi with I O's appended to 
the end of each vector. By adding the vectors v r+J -, for each (x, y) G Fi and each yp,yc G N ; 
such that y = yp — yc, we have that (x, yp,yc) = b' + X^=i n j v j f° r some m, . . . ,n r+ i G N; 
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in particular, for n\,.. . ,n r chosen such that (x, y) = b + Y^j=i n j u j an d n r+ j = yc(j) for each 

je{l,...J}. 

Thus Fi = | b' + Y^=l n j v j G N | , whence F{ is linear. □ 

Lemmas 3.1 and 3.2 immediately imply the following theorem. 

Theorem 3.4. A junction f : N k — > N l is stably computable by a CRC if and only if it is semilinear. 

One unsatisfactory aspect of Lemma 3.2 is that we do not reduce the computation of / directly 
to a CRD deciding the graph F of /, but rather to T> deciding a related set F . It is not clear 
how to directly reduce to a CRD deciding F since it is not obvious how to modify such a CRD to 
monotonically produce extra species that could be processed by the CRC computing /. Lemma 3.1, 
on the other hand, directly uses C as a black-box. Although we know that C, being a chemical 
reaction computer, is only capable of computing semilinear functions, if we imagine that some ex- 
ternal powerful "oracle" controlled the reactions of C to allow it to stably compute a non-semilinear 
function, then T> would decide that function's graph. Thus Lemma 3.1 is more like the black-box 
oracle Turing machine reductions employed in computability and complexity theory, which work 
no matter what mythical device is hypothesized to be responsible for answering the oracle queries. 



4 Semilinear functions can be quickly computed 

Lemma 3.2 describes how a CRC can deterministically compute any semilinear function. However, 
there are problems with this construction if we attempt to use it to evaluate the speed of semilinear 
function computation in the kinetic model. First, the configuration space is unbounded for any 
input since the construction searches over outputs without setting bounds. Thus, more care must 
be taken to ensure that any infinite execution sequence will be fair with probability 1 in the kinetic 
model. What is more, since the maximum molecular count is unbounded, it is not clear how to 
set the volume for the time analysis. Even if we attempt to properly define kinetics, it seems like 
any reasonable time analysis of the random search process will result in expected time at least 
exponential in the size of the output. 9 

For our asymptotic time analysis, let the input size n = ||x|| be the number of input molecules. 
In this section, the total molecular count attainable will always be O(n); thus, by finite density 
constraint, the volume v = O(n). 

We require the following theorem, due to Angluin, Aspnes, Diamadi, Fischer, and Rene [1], 
which states that any semilinear predicate can be decided by a CRD in expected time 0(n log n). 
(This was subsequently reduced to 0{n) by Angluin, Aspnes, and Eisenstat [2], but 0{n log n) 
suffices for our purpose.) 

Theorem 4.1 ( [1]). Let (ft : N fe — > {0, 1} be a semilinear predicate. Then there is a stable CRD T> 
that decides (ft, and the expected time to reach an output-stable state on input is 0{n log n). 

Throughout this section, we use the technique of "running multiple CRNs in parallel" on the 
same input. To accomplish this it is necessary to split the inputs X±, . . . , into separate molecules 
using a reaction Xj — > X\ + Xf + . • ■ + Xf , which will add only 0(log n) to the time complexity, so 

9 The random walk is biased downward because of the increasing propensities of the reactions consuming Yi's. 
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that each of the p separate parallel CRNs do not interfere with one another. For brevity we omit 
stating this formally when the technique is used. 

The next lemma shows that affine partial functions can be computed in expected time 0{n log n) 
by a CRC. For its use in proving Theorem 4.3, we require that the output molecules be produced 
monotonically. This is impossible for general affine partial functions. For example, consider the 
function f{x\,X2) = x\ — X2 where dom / = { (21,2:2) I x\ > X2 }■ By withholding a single copy 
of X2 and letting the CRC stabilize to the output value #Y = x\ — X2 + 1, then allowing the extra 
copy of X2 to interact, the only way to stabilize to the correct output value x\ — X2 is to consume 
a copy of the output species Y. Therefore Lemma 4.2 is stated in terms of an encoding of affine 
partial functions that allows monotonic production of outputs, encoding the output value y(j) as 
the difference between the counts of two monotonically produced species Yp and Yp , using the 
same technique used in the proofs of Lemmas 3.1 and 3.2. 

Let / : N fc — ► be an affine partial function, where, letting y = /(x), for all j G {1, ...,/}, 
y(j) = bj + j7 Yli=i n i,j( x W ~~ °i) f° r integer riij and nonnegative integer bj, Ci, and dj. Define 
/ : N fc --■ » N' x as follows. For each x £ dom /, define yc G N' for each j G {1, . . . , 1} as yc(j) = 
— 4; Y2i=i rnin{0, riij}(y:(i) — q). That is, yc(j) is the negation of the j'th coordinate of the output 
if taking the weighted sum of the inputs on only those coordinates with a negative coefficient riij . 
The value yp(j) is then similarly defined for all the positive coefficients and the bj offset: for each 
x G dom /, define yp G N' for each j G {1, . . . , 1} as yp(j') = bj + jr Yli=i max{0, rtij}(x(i) — a). 
Because x(i) — a > 0, yp and yc are always nonnegative. Then if y = /(x), we have that 
y = yp - yc- Define / as /(x) = (yp, yc)- 

Lemma 4.2. Let f : N k —•* N l be an affine partial function. Then there is a CRC that computes 
f : N fe --■» xN l in expected time 0{n\ogn), such that the output molecules monotonically increase 
with time (i.e. none are ever consumed), and at most 0(n) molecules are ever produced. 

Proof. If (yp, yc) = /( x ) ; then there exist kl integers n\^, . . . , n^ i £ Z and 2l + k nonnegative inte- 
gers 61, . . .,bi,cx, . . . ,c k ,dx, ■ ■ ■ ,di 6 N such that, if y = /(x), then for each j G {1, . . . ,/}, yc{j) = 
-J2i=i j] min{0,7iij}(x(i)-Ci) andyp(j) = bj+j: ^=i max {°) n i,i}( x W _c i)- Define the CRC as 
follows. It has input species E = {X±, . . . , X k } and output species T = {Yf, . . . , Y t p , Y±, . . . , Yp}. 

For each j G {1, . . . , I}, start with bj copies of Yp '. This accounts for the bj offsets. 

For each i G {1, . . . , k}, start with a single molecule C?, and for each m G {0, . . . , q — 1}, add 
the reactions 

Cf + ^ -> C7™ +1 (4.1) 
Cf+Xi C^+x; (4.2) 

This accounts for the q offsets by eventually producing x(i) — Cj copies of X^. Reaction (4.1) takes 
time 0(n) to complete because each reaction takes time at most 0{n) and a constant number, q, 
of such reactions must take place. Once is produced (hence there are now x(i) — Cj copies of 
Xi), reaction (4.2) takes time 0(n log n) to complete by a coupon collector argument. 
For each i G {1, . . . , k}, add the reaction 

X'. -> x^ + X^ + .-.+X,/ (4.3) 

This allows each output to be associated with its own copy of the input. Reaction (4.3) takes time 
O(logn) to complete. 
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For each i G {1, ...,&} and j G {1, . . . , /}, if rey > 0, add the reaction 

X itj -> nyZf (4.4) 



and if rtj < 0, add the reaction 



Reaction (4.4) produces dj(yp(j) — bj) copies of Zj", and reaction (4.5) produces dj~yc(j) copies of 
Zj. Each takes time O(logn) to complete. 

Finally, to produce the correct number of and Yl- output molecules, we must divide the 
count of each Z? and Z*? by dj. For each j G {1, . . . , /}, start with a single copy of a molecule 
-D^' P and another D®' . For each j G {1, ...,/} and each m G {0, ... ,dj — 1}, add the reactions 

d ™,p + Z p 



n m+l,P 

i ' 


if 


77?. 


< dj 


- 1 


j j ' 


if 


777. 


= dj 


- 1 




if 


777 


<dj 


- 1 


if 


777 


= dj 


- 1 



D ™,c + Z c ^ 

These reactions implement this division. By a coupon collector argument, they each require time 
0(n log re) to complete. □ 

The next lemma shows that every semilinear function / can be computed by a CRC in 0{n log re) 
time. It uses a systematic construction based on breaking down / into a finite number of partial 
affine functions fi, ■ ■ ■ , f m , in which deciding which fi to apply is itself a semilinear predicate. 
Intuitively, the construction proceeds by running many CRCs and CRDs in parallel on input x, 
computing all /j's and all predicates of the form fa = "x G dom /j?" The fa predicate computation 
is used to activate (in the case of a "yes" answer) or deactivate (in case of "no") the outputs of fi. 
Since eventually one CRD stabilizes to "yes" and the remainder to "no" , eventually the outputs of 
one fi are activated and the remainder deactivated, so that the value /(x) is properly computed. 

Lemma 4.3. Let f : N k — > N l be semilinear. Then there is a CRC C that stably computes f, and 
the expected time for C to reach a count-stable configuration on input x is 0(n log n) (where the 
O0 constant depends on f). 

Proof. By Lemma 3.2, there is a CRC C s that stably computes /. However, that CRC is too slow to 
use in this proof. We provide an alternative proof that every semilinear function can be computed 
by a CRC in expected time 0(n log re). Rather than relying on a random search of the output 
space as in Lemma 3.2, it computes the function more directly. Our CRC will have input species 
S = {X%, . . . , Xk} and output species V = {Yi, . . . , Y{\. 

By Lemma 4.4, there is a finite set F = {f± : N k — > N l , . . . , f m : N k — > N 1 } of affine partial 
functions, where each dom fi is a linear set, such that, for each x G N fc , if /i(x) is defined, then 
/(x) = /i(x). We compute / on input x as follows. Since each dom fi is a linear (and therefore 
semilinear) set, we compute each predicate fa = "x G dom fi and (W G{l,...,i — 1}) x dom /j/?" 
by separate parallel CRD's. (The latter condition ensures that for each x, precisely one of the 
predicates is true, in case the domains of the partial functions have nonempty intersection.) 
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By Lemma 4.2, we can compute each /j by parallel CRC's. Assume that for each i G {1, . . . , m} 
and each j £ {1, . . . , I}, the jth pair of outputs yp(j) and yc(j) of the ith function is represented 
by species Y^ and Y^. We interpret each Y^ and Yf< as an "inactive" version of "active" output 
species Y?- and Yf-. 

For each i £ {1, . . . , m}, we assume that the CRD computing the predicate 4>i represents its 
output by voting species L\ to represent "yes" and L\ to represent "no" . Then add the following 
reactions for each i £ {1, . . . , m} and each j 6 {1, . . . , I}: 

L} + Y^ ->■ L] + Y^ + Yj 
L° l+ Y,r -> L° + M„- 

(The latter two reactions implement the reverse direction of the first reaction using only bimolecular 
reactions.) Also add the reactions 

and 

y p -I- Y c - — ► K ■ 
Kj + Yj -> 

That is, a "yes" answer for function i activates the ith output and a "no" answer deactivates the 
ith output. Eventually each CRD stabilizes so that precisely one i has L\ present, and for all i' ^ i, 
L®, is present. At this point, all outputs for the correct function fi are activated and all other outputs 
are deactivated. The reactions enforce that at any time, #Yj = Ya^i + t^^.j- ^ n 

particular, jfcYj > #Kj and #Yj > jf=Mij at all times, so there will never be a Kj or Mj j molecule 
that cannot participate in the reaction of which it is a reactant. Eventually and #Yf< stabilize 
to to for all but one value of i (by the fifth reaction), and for this value of i, ifY^- stabilizes to 
y(j) and #Y^ stabilizes to (by the second-to-last reaction). Eventually #Kj stabilizes to by 

the last reaction. Eventually #Mij stabilizes to since L® is absent for the correct function /j. 
This ensures that #Yj stabilizes to y(j). 

It remains to analyze the expected time to stabilization. Let n = ||x||. By Lemma 4.2, the 
expected time for each affine function computation to complete is 0(n log n). Since the Yf- are 

produced monotonically, the most Y^ molecules that are ever produced is #oo^j- Since we have m 
computations in parallel, the expected time for all of them to complete is at most 0((n log n)m) = 
0(n log n) (since m depends on / but not n). We must also wait for each predicate computation 
to complete. By Theorem 2.1, each of these predicates takes expected time at most O(nlogn) to 
complete, so all of them complete in expected time at most 0{mn log n) = O(nlogn). 

At this point, the L\ leaders must convert inactive output species to active, and Lq (for i' ^ i) 
must convert active output species to inactive. A similar analysis to the proof of Lemma 4.3 shows 
that each of these requires at most 0(n log n) expected time, therefore they all complete in expected 
time at most 0((n log n)m) = 0{n log n). Finally, a similar argument shows that it requires at most 
expected time 0(nlogn) for the final two reactions to consume all Y i j and Kj molecules, at which 
point the system has stabilized. □ 
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Lemma 4.4. Let f : N k — > N' be a semilinear function. Then there is a finite set {f± : N fc --■> 
N', . . . , f m : N fc — ■> N 1 } of affine partial functions, where each dom fa is a linear set, such that, for 
each x G N k , if /i(x) is defined, then /(x) = /j(x), and (J^=i dom = N fc . 

We split the semilinear function into partial functions, each with a graph that is a linear set. 
The non-trivial aspect of our argument is showing that (straightforward) linear algebra can be used 
to solve our problem about integer arithmetic. For example, consider a partial function defined by 
the following linear graph: b = 0, ui = (1, 1, 1), 112 = (2,0, 1), 113 = (0,2, 1) (where the first two 
coordinates are inputs and the last coordinate is the output). Note that the set of points where 
this function is defined is where x\ + X2 is even. Given an input point x, the natural approach to 
evaluating the function is to solve for the coefficients ri\ , 712 , such that x can be expressed as a 
linear combination of ui, U2, U3 restricted to the first two coordinates. Then the linear combination 
of the last coordinate of ui, 112, U3 with coefficients m, ri2, TI3 would give the output. However, the 
vectors ui , U2 , U3 are not linearly independent (yet this linear set cannot be expressed with less than 
three basis vectors — illustrating the difference between real spaces and integer- valued linear sets) , 
so there are infinitely many real-valued solutions for the coefficients. We show that Uj must span 
a real subspace with at most one output value for any input coordinates. Then we can throw out 
a vector (say ui) to obtain a set of linearly independent vectors (112,113) and solve for n2,n% G R, 
and let n\ = 0. In this example, the resulting partial affine function is f(xi,X2) = {x\ + X2)/2. 

Proof. Let F = { (x, y) G N fc x N' | /(x) = y } be the graph of /. Since F is semilinear, it is 
a finite union of linear sets {L\, . . . , L n }. It suffices to show that each of these linear sets L m is 
the graph of an affine partial function. Since L m is linear, its projection onto any subset of its 
coordinates is linear. Therefore dom f m (the projection of L m onto its first k coordinates) is linear. 

We consider each output coordinate separately, since if we can show that each y(j) is an 
affine function of x, then it follows that y is an affine function of x. Fix j G {1, . . . , I}. Let L' m 
be the (k + l)-dimensional projection of L m onto the coordinates defined by x and y(j), which 
is linear because L m is. Since L' m is linear, there exist vectors b, ui,...,u p G N fc+1 such that 
L' m = { b + mui + . . . + n p u p I m, . . . , n p G N } . 

Consider the real- vector subspace spanned by ui,...,u p . It cannot contain the vector j = 
(0, . . . , 0, 1) T . Suppose it does. Take a subset of linearly independent vectors spanning this subspace 
from the above list (we possibly remove some linearly dependent vectors); say ui,...,u p /. The 
unique solution to the coefficients £1, . . . , £ p i 6l such that j = £1111 + . . . + £ p 'U p / can be obtained 
by using the left-inverse of the matrix with columns ui,...,u p / (the left inverse exists because 
the matrix is full-rank) . Since the elements of the left-inverse matrix are rational functions of the 
matrix elements, and vectors Ui, . . . , Uy consist of numbers in N, the coefficients £1, • • • are 
rational. We can multiply all the coefficients by the least common multiple of their denominators 
c yielding cj = miUi + . . . + m p iu p i where mi, . . . ,m p / £ Z. Now consider a point a in L' m 
defined as b + niUi + . . . + n p iu p >, where rij G N. We choose a such that n; are large enough that 
n! i = ni + mi > 0. Since n\ G N, we have that both a and a + cj = b + n^ui + . . . + n' ,u p / are in 
L' m . This is a contradiction because L' m is the graph of a partial function and cannot contain two 
different points that agree on their first k coordinates. Therefore j is not contained in the span of 
m, . . . ,Up. 

Consider again the real- vector subspace spanned by ui, . . . , u p . Again, let Ui, . . . , u p / be a subset 
of linearly independent vectors spanning this subspace. Since j is not in it, the subspace must be at 
most k dimensional. If it is strictly less than k dimensional, add enough vectors in N fc+1 to the basis 
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set for the spanned subspace to be exactly fc-dimensional but not include j. Call this new set of k 
linearly independent vectors wi, . . . , w^, where Wj = Uj for i £ {1, . . . , p'}. Let vi, . . . , v& E N k be 
wi, . . . , Wfc restricted to the first k coordinates. The fact that wi, . . . , are linearly independent, 
but j is not in the subspace spanned by them, implies that vi, ... , Vj. are linearly independent as 
well. This can be seen as follows. If Vi, . . . , v& were not linearly independent, then we could write 
Vfe = £i v i + • • • + £k-\Vk-i for some & 6 R. However, w fc / w' fe = ^wi + . . . +6;-l w fc-i- Since j is 
proportional to wl — w^, we obtain a contradiction. Therefore Vi, . . . , v*. are linearly independent. 

We now describe how to construct an affine function y(j) = /(x) for L' m from wi, . . . , w^. Let 
matrix V be the square matrix with vi,..., as columns. Let b' be b restricted to its first k 
coordinates. We claim that y(j) = h(k+l) + (wi(fc + 1), . . . , Wfc(fc + 1)) • V -1 • (x — b'). Below we'll 
show that this expression computes the correct value y(j). But first we show that it defines a partial 
affine function /(x). Because vi, . . . , are linearly independent, the inverse V -1 is well-defined. 
We need to show /(x) = bj + 4; J2i=i n y ( X W ~~ c ?) fo r integer riij and nonnegative integer bj, Ci, 
and dj, and that on the domain of /, x(i) — q > 0. The offset 6j = b(fe + l), which is a non-negative 
integer because b is a vector of non- negative integers. Since the offset vector b' is the same for each 
output dimension, and it is likewise non-negative, we obtain the offset a = h'(i). Further, since 
V -1 consists of rational elements (because V consists of elements in N), we can define dj and nij 
as needed. Finally, note that the least value of x(i) that could be in L' m is b'(i) = a, and thus on 
the domain of /, x(i) — a > 0. 

Finally, we show that this expression computes the correct value y(j). Let (£i, . . . , £k) T — 
V -1 • (x — b'), which implies that x = b' + Yli=i Ci v i- If our value of y(j) is incorrect, then 
3ni, . . . , rip G N such that b + ^?=i n i u i an d b + J2i=i agree on the first k coordinates but not 
on the k + 1st. Recall that the real- vector subspace spanned by Wi, . . . , includes the subspace 
spanned by ui, . . . , u p but does not include j. But Yli=i n i u i ~ S»=i ?« w « ^ s proportional to j and 
lies in the subspace spanned by wi, . . . , w^. Therefore we obtain a contradiction, implying that 
our value of y(j) is computed correctly. □ 

Angluin, Aspnes, and Eisenstat combined the slow, deterministic predicate-deciding results of [3] 
with a fast, error-prone simulation of a bounded-space Turing machine to show that semilinear 
predicates can be computed without error in expected polylogarithmic time [2]. We show that 
a similar technique implies that semilinear functions can be computed by CRNs without error in 
expected polylogarithmic time in the kinetic model, combining the same Turing machine simulation 
with our O(relogn) construction described in Lemma 4.3. 

We in fact use the same construction of Angluin, Aspnes, and Eisenstat [2] in order to conduct 
the fast, error-prone computation in our proof of Theorem 4.6. The next theorem formalizes the 
properties of their construction that we require. 

Theorem 4.5 ( [2]). Let f : N fc — > N' be a function by a t(m) -time-bounded, s(m) -space-bounded 
Turing machine, where m ~ logra is the input length in binary, and let c G N. Then there is a 
CRC C that computes f correctly with probability at least 1 — n~ c , and the expected time for C 
to reach a count-stable configuration is 0(t(m) 5 ). Furthermore, the total molecular count never 
exceeds 0(2 S ^). 

Semilinear functions on an m-bit input can be computed in time 0{m) and space 0{m) on 
a Turing machine. Therefore the bounds on CRC expected time and molecular count stated in 
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Theorem 4.5 are 0(log 5 n) and 0(n), respectively, when expressed in terms of the number of input 
molecules n. 

Although Angluin, Aspnes, and Eisenstat [2] exclusively use two-react ant, two-product reac- 
tions, and not all of the properties stated in Theorem 4.5 are explicitly stated in [2], their construc- 
tion can be easily modified to have the stated properties. Since that construction preserves the total 
molecular count, they require some non-uniformity to supply enough "fuel" molecules F, based on 
the space usage s(m) (which varies with the input size), so that the tape of the Turing machine can 
be accurately represented throughout the computation. However, in our model, molecules may be 
produced. We compute semilinear functions, where as observed above has total molecular count 
bounded by 0(n), so these fuels may be supplied by letting the first reaction of the input Xi be 
Xi — > X[ + cF, where X[ is the input interacting with the rest of the CRC, and c E N is chosen 
sufficiently large. 

The following theorem is the main theorem of this section. 

Theorem 4.6. Let f : N fc — > N l be semilinear. Then there is a CRC C that stably computes f , and 
the expected time for C to reach a count-stable configuration is 0(log 5 n). 

Proof. Our CRC will use the counts of Yj for each output dimension y (j) as the global output, and 
begins by running in parallel: 

1. A fast, error-prone CRC F for y,b, c = /(x), as in Theorem 4.5. For any constant c > 0, 
we may design F so that it is correct and finishes in time 0(log 5 n) with probability at least 
1 — n~ c , while reaching total molecular count never higher than 0(n). We modify F so that 
upon halting, it copies an "internal" output species Yj to Yj (the global output), Bj, and Cj 
through reactions H + Yj — >• Yj + Bj + Cj (in asymptotically negligible time). Here, H is 
some molecule that is guaranteed with high probability not to be present until F has halted, 
and to be present in large (f2(n)) count so that the conversion is fast. In this way we are 
guaranteed that the amount of Yj produced by C is the same as the amounts of Bj and Cj 
no matter whether its computation is correct or not. 

2. A slow, deterministic CRC S for y' = /(x). It is constructed as in Lemma 4.3, running in 
expected 0(n log n) time. 

3. A slow, deterministic CRD T> for the semilinear predicate "b = /(x)?". It is constructed as 
in Theorem 2.1 and runs in expected 0{n) time. 

Following Angluin, Aspnes, and Eisenstat [2], we construct a "timed trigger" as follows, using a 
leader molecule, a marker molecule, and n = ||x|| interfering molecules. These interfering molecules 
can simply be the input species and some of their "descendants" such that their count is held 
constant. This can be done for input species Xi by a reaction such as Xi — > I + X-, where Xi 
is (one of) the original input species, I is the interfering molecule, and X[ is the input species 
interacting with the remainder of the CRC. The leader will then interact with both Xi and / as 
interfering molecules. 

The leader fires the trigger if it encounters the marker molecule d times without any intervening 
reactions with the interfering molecules. This happens rarely enough that with high probability 
the trigger fires after F and T> finishes (time analysis is presented below). When the trigger fires, 
it checks if V is outputting a "no" (e.g. has a molecule of Lq), and if so, produces a molecule of 
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Pfi x . This indicates that the output of the fast CRC T is not to be trusted, and the system should 
switch from the possible erroneous result of T to the sure-to-be correct result of S. 

Once a Pfi x is produced, the system converts the output molecules Y- of the slow, deterministic 
CRC S to the global output Yj , and kills enough of the global output molecules to remove the ones 
produced by the fast, error-prone CRC: 

P fix + Y/ -> P fix + 1} (4.6) 
Pfix + C,- -> P fix + F j (4.7) 

y J+ F, -> 0. (4.8) 

Finally, Pfi x triggers a process consuming all species of T other than Yj, Bj, and Cj in expected 
O(logn) time so that afterward, T cannot produce any output molecules. More formally, let Qjr 
be the set of all species used by T . For all X G Qjr \ \J l j =1 {Yj, Bj, Cj}, add the reactions 

P fix + X -> Pfi x + K (4.9) 
K + X -»• K + K, (4.10) 

where K Qjr is a unique species. 

First, observe that the output will always eventually converge to the right answer, no matter 
what happens: If Pfi x is eventually produced, then the output will eventually be exactly that given 
by S which is guaranteed to converge correctly. If Pg x is never produced, then the fast, error-prone 
CRC must produce the correct amount of Yj — otherwise, V will detect a problem. 

For the expected time analysis, let us first analyze the trigger. The probability that the trigger 
leader will fire on any particular reaction number is at most rC d . In time n 2 , the expected number 
of leader reactions is 0(n 2 ). Thus, the expected number of firings of the trigger in n 2 time is 
n~ d+2 . This implies that the probability that the trigger fires before n 2 time is at most n~ d+2 . The 
expected time for the trigger to fire is 0(n d ). 

We now consider the contribution to the total expected time from 3 cases: 

1. J 7 is correct, and the trigger fires after time n 2 . There are two subcases: (a) T finishes before 
the trigger fires. Conditional on this, the whole system converges to the correct answer, never 
to change it again, in expected time 0(log 5 n). This subcase contributes at most 0(log 5 n) 
to the total expected time, (b) T finishes after the trigger fires. In this case, we may produce 
a Pfi x molecule and have to rely on the slow CRC S. The probability of this case happening 
is at most rT c . Conditional on this case, the expected time for the trigger to fire is still 
0(n d ). The whole system converges to the correct answer in expected time 0(n d ), because 
everything else is asymptotically negligible. Thus the contribution of this subcase to the total 
expectation is at most 0(n~ c ■ n d ) = 0{n~ c+d ). 

2. T is correct, but the trigger fires before n 2 time. In this case, we may produce a Pfi x molecule 
and have to rely on the slow CRC S for the output. The probability of this case occurring is 
at most n~ d+2 . Conditional on this case occurring, the expected time for the whole system 
to converge to the correct answer can be bounded by 0{n 2 ). Thus the contribution of this 
subcase to the total expectation is at most 0(n~ d+2 ■ n 2 ) = 0(n" d+A ). 
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3. T fails. In this case we'll have to rely on the slow CRC S for the output again. Since this 
occurs with probability at most n~ c , and the conditional expected time for the whole system 
to converge to the correct answer can be bounded by 0{n d ) again, the contribution of this 
subcase to the total expectation is at most 0(n~ c ■ n d ) = 0{n~ c+d ). 

So the total expected time is bounded by 0(log 5 n) + 0(n~ c+d ) + 0(n" d+4 ) + 0{n~ c+d ) = 
0(log 5 n) for d > 4, c> d. □ 



5 Conclusion 

We defined deterministic computation of CRNs corresponding to the intuitive notion that certain 
systems are guaranteed to converge to the correct answer no matter what order the reactions 
happen to occur in. We showed that this kind of computation corresponds exactly to the class 
of functions with semilinear graphs. We further showed that all functions in this class can be 
computed efficiently. 

A work on chemical computation can stumble by attempting to shoehorn an ill-fitting compu- 
tational paradigm into chemistry. While our systematic construction may seem complex, we are 
inspired by examples like those shown in Fig. 1 that appear to be good fits to the computational 
substrate. While delineation of computation that is "natural" for a chemical system is necessarily 
imprecise and speculative, it is examples such as these that makes us satisfied that we are studying 
a form of natural chemical computation. 

Our systematic construction (unlike the examples in Fig. 1) relies on a carefully chosen initial 
context — the "extra" molecules that are necessary for the computation to proceed. Some of these 
species need to be present in a single copy ("leader"). We left unanswered whether it may be 
possible to dispense with this level of control of the chemical environment. We suspect this gener- 
alization would be non-trivial because the problem of generating a prescribed molecular count of a 
species from an uncontrolled context is computationally challenging (see e.g. the "leader election" 
problem [4]). 

In contrast to the CRN model discussed in this paper, which is appropriate for small chemical 
systems in which every single molecule matters, classical "Avogadro-scale" chemistry is modeled 
using real-valued concentrations that evolve according to mass-action ODEs. Moreover, despite 
relatively small molecular counts, many biological chemical systems are well- modeled by mass- 
action ODEs. While the scaling of stochastic CRNs to mass-action systems is understood from 
a dynamical systems perspective [14], little work has been done comparing their computational 
abilities. There are hints that single/few- molecule CRNs perform a fundamentally different kind 
of computation. For example, recent theoretical work has investigated whether CRNs can tolerate 
multiple copies of the network running in parallel finding that they can lose their computational 
abilities [8,9]. 

Does our notion of deterministic computation have an equivalent in mass-action systems? Con- 
sider what happens when the CRN shown in Fig. 1(c) is considered as a mass-action reaction 
network, with (non-negative) real-valued inputs [A"i]o, [-X^o and output [Y]oc (where we use the 
standard mass-action convention: [-]q for the initial concentration, and [-]oo for the equilibrium con- 
centration). In the limit t — > oo, the mass-action system will converge to the correct output amount 
°f [^]oo = max ([-Xi]o> [A^Jo), and moreover, output amount is independent of what (non-zero) rate 
constants are assigned to the reactions. Thus one is tempted to connect the notion of deterministic 
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computation studied here and the property of robustness to parameters of a mass-action system. 
Parameter robustness is a recurring motif in biologically relevant reaction networks due to much 
evidence that biological systems tend to be robust to parameters [5]. 

However, the connection is not simple. Consider the CRN shown in Fig. 1(a). In the mass- 
action limit it loses the ability of computing the floor function, but still computes \Y]oo = [X]q/2 
for real valued [X]o, [l"]oo> independent of reaction rates. More interestingly, the CRN shown in 
Fig. 1(b), when considered as mass-action reaction network, could converge to a different amount 
of Y as t — > oo, depending on the rate constants of the last two reactions and the input amounts. 
Specifically, let fei, k 2 , and ks be the rate constants of the three reactions, respectively. If [X\]q > 
[X 2 ] and k 2 < fc3[X 2 ] /([Xi] - [X 2 ]q), then Y will go to k 2 /k 3 ([X l ] - [X 2 ]o) rather than [X 2 ]o 
as in Fig. 1(b). In all other cases, the output will correctly match the function in the figure. 
(This can be verified by determining the steady states of the system and then determining the 
stability of each one as a function of the initial concentrations and rate constants.) The cause of 
the disagreement between stochastic and mass-action instances of this CRN can be identified with 
the "type I" deviant effect demarcated by Samoilov and Arkin [17]. 
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